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Abstract 

Neural responses are highly variable, and some portion of this variability arises 
from fluctuations in modulatory factors that alter their gain, such as adaptation, 
attention, arousal, expected or actual reward, emotion, and local metabolic re¬ 
source availability. Regardless of their origin, fluctuations in these signals can 
confound or bias the inferences that one derives from spiking responses. Recent 
work demonstrates that for sensory neurons, these effects can be captured by a 
modulated Poisson model, whose rate is the product of a stimulus-driven response 
function and an unknown modulatory signal [1]. Here, we extend this model, 
by incorporating explicit modulatory elements that are known (speciflcally, spike- 
history dependence, as in previous models [2, 11]), and by constraining the re¬ 
maining latent modulatory signals to be smooth in time. We develop inference 
procedures for fitting the entire model, including hyperparameters, via evidence 
optimization [3], and apply these to simulated data, and to responses of ferret audi¬ 
tory midbrain and cortical neurons to complex sounds. We show that integrating 
out the latent modulators yields better (or more readily-interpretable) receptive 
held estimates than a standard Poisson model. Conversely, integrating out the 
stimulus dependence yields estimates of the slowly-varying latent modulators. 


1 Introduction 

One of the great mysteries of neuroscience is how the brain manages to perform stable and use¬ 
ful computations using such apparently unreliable elements as neurons. One can present the same 
stimulus to a sensory neuron over and over again, and the number of spikes it produces will differ 
each time. While some of this variability arises from biophysical processes within neurons (e.g., 
synaptic transmission, diffusion processes within and across membranes), we have also known for 
a long time that neural responses are affected by a huge array of contextual state variables: arousal; 
attention; expected or actual reward; emotions; local metabolic resource availability; and the pres¬ 
ence of stimulants, depressants or anesthetics. In turn, fluctuations in these variables can compound 
the measured variability of neural responses [e.g., 12, 13, 14]. 

Although these sources of variability are well known in the neuroscience community, many of them 
are difficult to control or measure. Experimentalists largely handle this variability by simple aver¬ 
aging. In sensory systems, for example, stimuli are typically presented repeatedly for many trials; 
the results are then averaged, in order to estimate the stimulus-dependent component of the neural 

^Submitted to NIPS 2014 (not accepted). This version has been slightly modified to eorrect some minor 
mistakes; the original submission is preserved as vl in arxiv. 
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response. This approach has several major drawbacks. First, the need for many repeats is very 
resource-intensive. Second, it assumes that the variability can actually be removed by averaging, 
which need not be true. Third, there are no widely agreed-upon means for assessing the contribution 
of these unknown sources, before or after averaging. 

If we wish to study a non-stationary brain, then it behooves us to consider more explicitly how these 
unknown state variables manifest in neural responses. A more explicit model of their effect could, 
in principle, have two major benefits: it might allow us to directly estimate the contextual influences 
on neurons, and it could provide better methods to infer stimulus-response relationships. 

To first approximation, the effect of these state variables is modulatory: they alter the gain or respon- 
sivity of neurons, rather than directly cause the generation of spikes (Fig. 1). This idea has recently 
proven useful as a structural constraint for a model of the desired form. Goris et al [1] describe 
neural responses as arising from a Poisson process whose rate is a product of stimulus drive and a 
latent gain signal. This model provides an accurate account of the variability of sensory neurons, 
attributing a substantial fraction to fiuctuations in modulatory infiuences. Here, we generalize this 
modulated Poisson (MoP) model, providing it with a more explicit stimulus-dependent component, 
and assuming that the unknown state variables driving the gain fiuctuations change only slowly 
in time. We develop statistical inference procedures for extracting both the latent gain signal, as 
well as the parameters of the stimulus dependency. We test these procedures on simulated data, and 
demonstrate the success of the model in explaining electrophysiological data obtained from auditory 
neurons in the anesthetized ferret. 

We note that our formulation has similarities to previously published models. In form, this is an 
extension of the Generalized Linear Model (GLM, see below; [2]), but with the inclusion of a time- 
varying latent signal. Our model shares the general form of a latent state space model for neural 
data [15-19]. This class includes the Poisson Linear Dynamical System (PLDS) model [17]. Unlike 
the PLDS, which relies on autoregressive processes to characterise the latent signal, our formulation 
relies on Gaussian Processes. We also relax the constraint that the latent modulatory signal is passed 
through the same nonlinearity as the stimulus-response function, and rather consider it simply to be 
a positive-valued signal that has a multplicative effect. 

An earlier version of these results was presented at the CoSyNe meeting in 2014. 
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Figure 1: Neurons are subject to modulatory infiuences, which affect their spiking output, (a) Spike 
counts recorded from a real VI neuron, in the anesthetized macaque, from [6]. Stimuli (oriented 
gratings) drive the firing of this neuron, but have dynamics on a timescale of seconds. Modulatory 
infiuences on this cell create slow non-stationarities that are visible in its spiking output, (b) Two 
epochs of low and high firing (labelled A and B). (c-d) the stimulus selectivity of the neuron in these 
two epochs differs by a multiplicative gain factor. 


2 Modulated Poisson Model 

Our central abstraction is to build a generative model of neural responses that consists of three 
components. First, there are the controlled or measured inputs to a neuron. For a sensory neuron, 
this might be encapsulated as “the stimulus”. Second, there a slowly varying stochastic modulatory 
process, that acts as a multiplier on the instantaneous firing rates. This component is not directly 
observed. Third, there is the spike-generating point process (whose output is typically observed). 

For more detail, we begin with the classical Poisson framework for describing the stimulus-driven 
response of a sensory neuron (Fig. 2a). This asserts a simple generative model for producing spike 
counts within a set of time windows, and contains only the input-rate and point-process components. 
We assume here we have a collection of discrete time windows of identical duration, indexed by t. 
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and that the spike rate within each of these windows, //t. is a function, F, of the current stimulus, 
Xt (or, perhaps, the recent history of stimulation, which can be encapsulated in Xt), and a set of 
input-response parameters, k\ 

^it = F{xt\k) ( 1 ) 

In turn, we assume that the spike count within each window, yt, is drawn from a Poisson distribution 
with rate /if We use the vector notation y to describe the set of scalars yt. 

This class includes the Linear-Nonlinear-Poisson (LNP) models (where fit = F{xjk) ), the Gener¬ 
alized Linear Model (GLM), where the input Xf encapsulates a range of known time-varying signals, 
such as the recent history of spike counts from the same and other neurons [2], and the Generalized 
Quadratic Model (GQM), where F also includes quadratic operations on Xt [7]. 

In the Modulated-Poisson (MoP) model [1], we assume that the firing rate is additionally modulated 
by a multiplicative interaction with a latent (unobserved) variable gt : 

IJ-t= 9f F{xt\k) (2) 

We make two specific assumptions about the vector of time-varying modulator values, g. First, we 
assume it is positive, by writing gt = exp(/it)- This choice is not unique: other nonlinear functions 
could also be used [10]. Second, we assume that it is slowly varying. The slow dynamic onh = [ht] 
could take on any form appropriate to a given system (e.g. linear Gaussian, discrete Markov). Here, 
for tractability and generality, we assume that h is multivariate Gaussian: 

h - Ar(0,C(6/)) (3) 

where the smoothness is imparted by C(0), the prior covariance matrix on h, via a set of hyperpa¬ 
rameters 6. This structure is depicted in the schematic shown in Fig. 2b, and the graphical model 
shown in Fig. 2c. Below, we motivate a particular choice of form for C(6). 



Figure 2: (a) The classical Poisson model, and (b) the Modulated-Poisson model for the generation 
of spikes by a neuron. The latter adds a structured stochastic gain modulation, (c) Graphical model 
representation of the MoP-model. 


3 Inference 

Suppose we have a set of neural data with stimuli X = [xt] and observed spike counts y. Our goal 
is to infer the two unknown components of the model: the log-modulators h, and the parameters of 
the input-response relationship, k. In the case where the nonlinearity F is the exponential function, 
these two stages can be merged. For generality, we maintain our central abstraction here, and split 
the inference into separate stages for each of these two components. 

More precisely, we approximate the joint posterior by p{h^6,k\X^y) ^ qk{k\X^y) • 

qh,e{h,6\X^y), and then solve iteratively for the two components. For brevity, we omit the sub¬ 
scripts on q and the conditioning on X and y from the notation, and write these simply SiS q{k) and 
q{h^ 0). Here, we shall remain agnostic to the form of the input-response relationship (F), and thus 
focus entirely on the modulator component. Further, rather than solving jointly for the modulator 
h and its hyperparameters 0, we follow the approach of [3], and split this inference into two parts: 
first, we calculate a maximum-marginal-likelihood point estimate 9 = arg max^ p{X,y\6^k), and 
then we calculate a posterior on h conditioned on 6. We concentrate on the latter problem first. 
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3.1 Inference on h 


Suppose we have an estimate of q{k), and 9, a point estimate of 9. To solve for h given these, we 
begin by writing the terms of the log joint, CJ = logp(/i, k; X, y) that depend on h, and then 
we condition on our current estimates of q{k) and 9: 

= -0^ ex.p{h) + y^h- C{9)~^h + const (4) 

where u is the vector of values Ot = lEg(fc) [F{xt] k)], i.e. the expected stimulus-driven component 
of the firing rate under our current estimate of q{k), and the exponential is applied elementwise. 

The number of elements of h is equal to the number of time windows, T. This is typically very large. 
For example, an hour of continuous recording, with spike counts binned at 25 ms intervals has T 
10^; at 5 ms resolution, this number exceeds 7 x 10^. Performing inference on h scales with 0{T^), 
and is therefore intractable: the objective is expensive to compute, and the number of variables to 
optimize is far too large. We therefore reduce its dimensionality. Our strategy is to project h into 
a low-dimensional space specified by the eigendecomposition of the prior covariance, C. Since we 
want h to be slowly varying, the prior to be translation invariant, and the eigendecomposition and 
projection to be inexpensive, a natural choice is for C to be circulant. In this way, the eigenvectors 
are sinusoids, the prior is diagonalized to a Fourier power spectrum (in this case, low-pass) which 
can be parameterized by 9, the transform can be quickly calculated through the FFT, and inference 
can proceed on the Fourier coefficients of h within the pass-band. 

More specifically, we write: C = LQ as the eigendecomposition; P as the (T* x T) projection 
operator which excises the low-eigenvalue dimensions, i.e. those with low prior power; as its 
pseudoinverse, which imputes missing coefficients as zero; and R = PQ as the combined transform 
to the reduced space. We consider the DFT matrix Q as an invertible real-valued transform, by 
concatenating the real and imaginary parts of the transformed vectors; since h is real-valued, the 
negative frequencies are redundant and can be omitted from the computation. We implement all 
matrix operations that involve Q and R via an orthonormalized, real-valued, real-signal FFT. 

As a result, we perform inference on the length-T* vector h* , corresponding to a reduced Fourier- 
domain representation of the modulator, h. The transform back to the time domain is given by 
h = R^h* = Q^P^h*. This reduces the complexity of inference to 0{T\ogT (T*)^), 
corresponding to the cost of the FFT plus the cost of inference in reduced space. 

In reduced Fourier space, the conditional log posterior on h*, its Jacobian and its Hessian become: 



= e^pfR^h*) +y^R^h* - ^ 

(5) 

dCVh> 

1 

* 

rH 

1 

* 

' ^ 

1 

1 

II 

(6) 

dh* 


= -Rdmg{n)R^ - 

(7) 

dh* dh*'^ 


where /j, = 0 Q ex.p{R^h*) is the firing rate, 0 is the elementwise product, and the diagonal 
matrix L* = RCR^ = PLP^ is the projected prior covariance on h*. Since equation <0 
contains a sum of exponentials, a moment-matched exponential family form for q{h*\9) cannot be 
tractably normalized. However, since yt ^ 0^ the Hessian is convex wrt. /i*, so we can solve for 
a Laplace approximation of the conditional posterior instead, q{h*\9) = M{h*^ A*). In principle, 
this reduced Fourier representation can be transformed back into the time domain, giving q{h\9) = 
M{h, A) = J\f {R^h*, R^A*R), but since A is (T x T), it is typically too large to fit in memory. 
Sampling and conditional inference can nevertheless proceed with q{h*) rather than q{h). 

In order for the FFT to be an efficient tool, the time windows should be regularly sampled. It is 
nevertheless straightforward to handle missing data at time points t', by simply setting = yt' =0. 
In this way, imputed values of ht> do not materially contribute to the likelihood terms in the objective 
and its derivatives, and are constrained only by the smoothness and shrinkage terms from the prior. 
This also provides a way of avoiding artefacts from circular boundary conditions imposed by the 
Fourier transform: we pad h with an additional T values, for which the responses are unobserved. 
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Finally, the first term of the Hessian is expensive to compute directly, even via the FFT implementa¬ 
tion. However, we note that this matrix is a selection of rows and columns from —Q diag {/j,) , 

which, in turn, is a real-valued rearrangement of the real and imaginary parts of a complex-valued, 
Hermitian, circulant matrix, with entries constructed from the Fourier transform of fji. This enables 
a quick construction. 


3.2 Inference on 0 


We place a smoothness prior on h. For a given neural recording, the degree of smoothness, i.e. the 
timescales over which the latent modulators vary, is itself unknown. We therefore build a hierarchical 
model, wherein the smoothness is controlled by a set of hyperparameters, 6. In the Fourier domain, 
this amounts to parameterizing the prior power spectrum - i.e. the elements of the diagonal of L - 
as a function of 6. 

One potential form for a low-pass prior on h is the zero-mean ALDf prior (Automatic Locality 
Determination in the frequency domain [3]). This parameterizes the elements of the diagonal of L 
as a function of their respective Fourier frequencies, /(i), and two hyperparameters 6 = 

[L]ii = exp (8) 


In practice, ther e is co nsiderable computational pressure to minimize the number of coefficients of 
h* (see Section 4.1.1| ), which means we have to truncate this spectrum at some point. We find it 
more practical to construct a low-pass prior covariance from a more compact windowing function, 
such as a Blackman-Harris window, with hyperparameters 6 = {Fc, p}: 


/ e ^ Yfn=oi~^T«'n cos {TTn{l + f{i)/Fc)) : f{i)<Fc 

\ 0 : fii) > Fe 


where a = [0.35875, 0.48829, 0.14128, 0.01168]^. Compared with the truncated ALDf prior, this 
has less fiexibility in controlling the ripple in the estimate of h. Nevertheless, the difference in fitted 
h values are typically very small for a modest speed improvement. 

To choose values for the hyperparameters, we follow the approach of Park and Pillow [3, 4], and 
solve for the maximum marginal likelihood value of 0, a procedure also known as evidence opti¬ 
mization. The idea is to integrate out the values of h, and maximize the marginal likelihood of the 
data as a function of 6. This gives a point estimate of 0, as 0. We refer the reader to these papers 
for a complete description of this process. 


3.3 Inference on k 


In parallel to the approach in Section 3.1 we suppose we have an estimate of the modulators g(/i, 6), 
and wish to solve for the input-response parameters, q{k). The terms of the log joint containing k 
are: 


CJ = -F{xpky eyi-p{h) + y~^ log F{xt;k) + \ogp{k) + const (10) 

where the exponential and logarithm are applied element-wise. Next, we condition on our current 
estimate of g(/i, 6). To do this, we need only replace exp(/i) with its expectation under g(/i, 6) : 

i) (11) 

where the latter expression avoids explicitly constructing A. We calculate A* via an IFFT on 
the columns of A*, and we build R^ by taking the IFFT of the columns of . Inference on k 
then proceeds in a manner appropriate to the specific form of the function F, e.g. by maximizing 
^q{h,e) [^J] wrt. k. 

In practice, we find that two to three iterations of solving for q{h^ 6) given q{k), then q{k) given 
g(/i, 0), are sufficient for convergence. 


Eq(h,0) [exp(/i)] = exp ^ diag A j = exp ^ (^{R^ A*) O 
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4 Results 


4.1 Simulations 

4.1.1 Recovering the modulator 

We start by asking how well this inference procedure recovers underlying modulators when we know 
the ground truth. To answer this, we simulated a neuron’s spiking response to a stimulus ensemble. 
We assumed a simple input: the model neuron’s expected response to a stimulus of value x was 
described by a smooth tuning function f{x), which we discretized as a length-K vector, k. We 
also subjected the model neuron to a slow gain fluctuation (Fig. 3a). We set the ground truth for 
ktruG and ^true, and sampled the modulator, /itrue, and spikes, y, from the generative model, for 
T consecutive, pseudo-random presentations of the stimuli. We then performed complete inference 
of k, h, and 6. 



faster modulators 



Figure 3: Simulated data, illustrating inference on the latent modulator, (a) Top: we sampled a 
modulator (blue) for a given 6, and applied it to a model neuron responding to a stimulus. From 
the spiking output (bottom), we inferred q{h), which is overlaid on the top panel. Black: posterior 
mode, h; red: samples from the posterior, q{h). (b)-(c): Quality of recovery of /itrue^ measured as 
sim(/i, /itrue) = 100 X (1—Var(/i — /itrue) /Var(/itrue ))5 improves with several factors: increasing 
the cutoff frequency of the imposed modulators (abscissae); increasing the standard deviation of the 
imposed modulator ((b); factors of 1/4, 1/2, 1, and 4); and increasing the firing rate ((c); factors 
of 1/4, 1, and 4. (d) The computational cost of computing q{h\6) as a function of 6 (abscissa, 
measured via the length of h*), and the number of time windows (symbols). 


We reasoned that the quality of the modulator estimate would depend on how much data was avail¬ 
able to estimate each phase of the modulator, as well as the quality of that data. We tested this in 
three ways. First, for slow modulations, the posterior mode of the recovered modulator, h was very 
close to the ground truth, /itrue; but when the imposed modulators were too fast, and approached 
the sampling resolution of the time windows themselves, the fit quality declined. Second, when we 
reduced the magnitude of the fiuctuations in /itrue. the estimates h captured the slower modulations 
only (and 6 was more conservative than ^true)- Finally, we observed the same result when we de¬ 
creased the average firing rate (Fig. 3b-c). These results reflect an important feature of the inference 
procedure: by maximizing the marginal likelihood of 0, we learn the fastest timescale of modulation 
that the data are able to provide evidence for. 

The inference on h scales as 0{T log T + (T*)^), and is thus constrained by two factors (Fig. 
2d). First, as the number of observations grows, the FFT operations become more costly, regardless 
of the value of 6. Second, as the number of modulator coefficients (determined by 6) grows, the 
optimization of h* becomes more expensive. This cost appears to dominate as the hyperparameter 
admits higher frequencies. Since computation time is always limited, it is necessary to set some 
upper bound on the maximum length of h*, and hence on 6, for each dataset. Our experiments 
suggest that keeping the number of modulator coefficients below about 1000-2000 is a reasonable 
goal. This, however, means that any modulations at higher frequencies cannot be tractably learned 
using this algorithm. 


6 



















4.1.2 Recovering the input-response relationship 

Even if we are not interested in recovering the modulatory signal per se, including it in the model 
can yield a substantial improvement in estimates of the input-response parameters, k. 

To demonstrate this, we sampled from a simple generative model for the spiking responses of au¬ 
ditory cortical neurons, based on the Generalized Linear Model (GLM, [2]; Fig. 3). The auditory 
stimulus was a Dynamic Random Chord (DRC) ensemble as described in [5]. Briefly, this is a col¬ 
lection of simultaneously-presented tones; the levels of each tone were drawn independently every 
25 ms from a flxed distribution. We simulated neurons with a flxed set of parameters: a linear 
spectro-temporal receptive held (STRF) that was localized in time and frequency, a pointwise ex¬ 
ponential nonlinearity, and a spike feedback term that modulated the firing rate for a period of up 
to 200 ms after every spike. In addition, we added gain modulations of different kinds. We then 
fitted GLMs to the simulated spike trains, assuming an ALDs prior on the STRF, and an ALDsf 
prior on the spike history term [3]. We also added latent modulators to the models. For brevity, 
we refer to the GLMs fitted without the latent modulators as P-GLMs, as the only stochastic source 
in these models is the conditionally Poisson point process, while we refer to GLMs with the latent 
modulators as MoP-GLMs, as they additionally have a stochastic modulator process. 

In the absence of slow modulations, i.e. under conditions of stationary gain, the P-GLMs recovered 
the ground truth STRFs and spike history terms with a high degree of accuracy (Fig. 4a). However, 
when there was either a slowly varying gain (Fig. 4b) or a slow monotonic decrease in gain over time 
(Fig. 4c), the fitted spike history term for the P-GLMs included a near-constant positive offset. This 
artefact reflects the only means the model has to capture a slowly changing gain state: to smooth 
the last few seconds’ spike counts as a proxy measure of the modulator signal, and leverage this to 
change the predicted firing rate in the next time bin. Indeed, when we increased the speed of the 
imposed gain modulations, the estimated spike feedback kernels differed wildly from the ground 
truth (Fig. 4d). In all cases, switching to a MoP-GLM removed these biases, and recovered more 
accurate estimates of the spike-feedback kernels. 
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Figure 4: Simulated data, illustrating inference on k with and without the latent modulator, (a) 
Ground truth receptive held and spike-feedback kernel used to generate spikes according to a GLM 
forward model, (b)-(d) Top row: modulators applied to simulated neurons (blue); estimates h from 
the MoP-GLM model showing excellent recovery (black/red). Bottom: comparison of P-GLM 
(green) and MoP-GLM estimates (black/red) of spike-history kernels. Without accounting for the 
modulator, the inference on k is biased. 


4.2 Real neural data 

4.2.1 Improved predictive power 

To test the modulated-Poisson model in practice, we fit GLMs to real spiking data obtained from 
extracellular recordings of auditory neurons in the primary auditory cortex and midbrain of anes¬ 
thetized ferrets. The details of the data collection are described in [5, 9]. We fitted both P-GLMs 
and MoP-GLMs, as per the previous section, to approximately an hour of spiking responses to ran¬ 
dom chord stimulation from each of 339 recorded neurons. These data were recorded under typical 
non-stationary physiological conditions; we note, for instance, that the depth of anesthesia in such 
preparations is known to vary over time (though was not quantified here). 

We trained the GLMs on 80% of the data. We constructed the held-out 20% test set by selecting 
random 250 ms contiguous snippets from each recording, with a minimum of 250 ms training data 
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between each test snippet. By setting Ut' = Vt' = ^ for these snippets, inference on the training data 
simultaneously yielded predictive distributions on the modulators for the test data. Since the test 
snippets were very short in duration compared with the fastest tractable latent modulator timescale, 
the values of h were almost constant over each test snippet. In particular, we emphasize that while 
the MoP-GLMs have more fitted parameters than the P-GLMs, these extra parameters were fit ex¬ 
clusively to the training dataset: we introduced no new latent variables to describe the held-out 
data. 

Since ground truth is unknown for these datasets, we simply compared the likelihoods on the test 
datasets for the P-GLM and MoP-GLM model fits. For all 339 neurons, the MoP-GLMs produced 
better predictions of the held-out data than the P-GLMs. The improvement scaled with the amount 
of non-stationarity estimated in the spike train (Fig. 5a). Moreover, the model-estimated value of 
the modulator hyperparameters, 0, appeared optimal for prediction: when we fixed the modulator 
time constants to values higher or lower than the learned values, predictions were measurably worse 
(Fig. 5b). 



Figure 5: Real data, showing the improved predictive power of a model which includes a latent 
modulator, (a) Example multi-unit neural responses from auditory cortex [5]. Top left: spike counts. 
Remaining panels show fits of the components of MoP-GLM to this cell, (b) Improved predictions 
on held-out data from 339 units, measured as the likelihood ratio per second between the MoP- 
GLM and the P-GLM. The difference was greatest for cells with more estimated modulation (r = 
0.6). (c) Median improvement of MoP-GLM over Poisson-GLM across the population, when the 
cutoff frequency was set to a value higher or lower than the learned optimum. This shows the local 
sensitivity to mis-estimation of 6. 


5 Discussion 

We developed a modulated-Poisson model, that includes a smooth, latent modulatory signal, which 
is combined multiplicatively with a stimulus-driven firing rate, as well as an inference procedure for 
estimating all parameters given noisy neural data. Our application of this model to simulated data 
indicates that the inference recovers an underlying modulator of the neuron’s firing rate, so long as 
this modulator varies on a sufficiently slower time scale than the windows over which spikes are 
counted. On the other hand, if we exclude the modulator from our models, our simulations show 
that this can lead to substantial biases in the estimation of stimulus-response parameters. When we 
apply this model to real data from the auditory midbrain and cortex of the ferret, we find that cross- 
validated predictions of neural responses improve. We believe that the tools we have developed here 
offer a principled solution to a long-standing problem in experimental neuroscience: how to build 
analysis for a non-stationary brain. 
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